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1  Introduction 

The  Sobol  sequence  [12,  13]  is  one  of  the  standard  quasirandom  sequences,  and  is  widely  used  in  Quasi- 
Monte  Carlo  (QMC)  applications.  QMC  methods  are  a  variant  of  ordinary  Monte  Carlo  (MC)  methods 
that  employ  highly  uniform  quasirandom  numbers  in  place  of  the  pseudorandom  numbers  used  in 
ordinary  Monte  Carlo  (MC)  [3].  QMC  methods  are  now  widely  used  in  scientific  computation,  especially 
in  estimating  integrals  over  multidimensional  domains  and  in  many  different  financial  computations. 
The  error  of  MC  methods  is  asymptotically  where  N  is  the  number  of  samples,  while  QMC 

methods  can  have  an  error  bound  which  behaves  as  well  as  0({logN)^N~^)^  for  5-dimensional  problems. 
While  quasirandom  numbers  do  improve  the  convergence  of  applications  like  numerical  integration,  it  is 
by  no  means  trivial  to  provide  practical  error  estimates  in  QMC  due  to  the  fact  that  the  only  rigorous 
error  bounds,  provided  via  the  Koksma-Hlawka  inequality  [3,  21,  25],  are  very  hard  to  utilize.  In  fact, 
the  common  practice  in  MC  of  using  a  predetermined  error  criterion  as  a  deterministic  termination 
condition,  is  almost  impossible  to  achieve  in  QMC  without  extra  technology.  In  order  to  provide  such 
dynamic  error  estimates  for  QMC  methods,  several  researchers  [21,  25]  proposed  the  use  of  Randomized 
QMC  (RQMC)  methods,  where  randomness  can  be  brought  to  bear  on  quasirandom  sequences  through 
scrambling  and  other  related  randomization  techniques  [4,  5,  9].  The  core  of  RQMC  is  to  find  fast  and 
effective  algorithms  to  randomize  (scramble)  quasirandom  sequences. 

Besides  providing  practical  error  estimates,  another  byproduct  of  scrambled  quasirandom  numbers 
is  that  they  furnish  a  natural  way  to  generate  quasirandom  numbers  in  parallel  and  distributed  appli¬ 
cations.  This  is  because  an  elegant  solution  to  this  problem  is  to  take  a  single  quasirandom  sequence 
and  to  provide  a  differently  scrambled  version  of  this  sequence  to  each  process  requiring  quasirandom 
numbers.  Moreover,  QMC  applications  have  high  degrees  of  parallelism,  can  tolerate  large  latencies, 
and  usually  require  considerable  computational  effort,  making  them  extremely  well  suited  to  parallel, 
distributed,  and  even  Grid-based  computational  environments.  In  these  environments,  a  large  QMC 
problem  is  broken  up  into  many  small  subproblems.  These  subproblems  are  then  scheduled  on  the 
parallel,  distributed,  or  Grid-based  environment.  In  a  more  traditional  instantiation,  these  environ¬ 
ments  are  usually  a  workstation  cluster  connected  by  a  local-area  network,  where  the  computational 
workload  is  cleverly  distributed.  Recently,  peer-to-peer  or  Grid  computing,  the  cooperative  use  of  ge¬ 
ographically  distributed  resources  unified  to  act  as  a  single  powerful  computer,  has  been  investigated 
as  an  appropriate  computational  environment  for  MC  applications  [15,  16].  The  computational  infras¬ 
tructure  developed  for  this  was  based  on  the  existence  of  a  high-quality  tool  for  parallel  pseudorandom 
number  generation,  the  Scalable  Parallel  Random  Number  Generators  (SPRNG)  library  [18,  17].  The 
extension  of  this  technology  to  quasirandom  numbers  would  be  very  useful. 

Unlike  pseudorandom  numbers,  there  are  only  a  few  common  choices  for  quasirandom  number 
generation.  However,  by  scrambling  a  quasirandom  sequence,  one  can  produce  a  family  of  related 
quasirandom  sequences.  Finding  one  or  a  group  of  optimal  quasirandom  sequences  within  this  family 
is  an  interesting  problem,  as  such  optimal  quasirandom  sequences  can  be  quite  useful  for  enhancing  the 
performance  of  ordinary  QMC.  The  process  of  finding  such  optimal  quasirandom  sequences  is  called 
the  derandomization  of  a  randomized  (scrambled)  family  of  quasirandom  sequences.  In  addition  to 
providing  more  quasirandom  sequences  for  QMC,  derandomization  can  help  us  to  improve  the  accuracy 
of  error  estimation  provided  by  RQMC.  This  is  due  to  the  fact  that  one  can  find  a  set  of  optimal 
sequences  within  a  family  of  scrambled  sequence  family,  and  use  sequences  within  this  set  for  error 
estimation. 

The  purpose  of  randomizing  (scrambling)  in  QMC  is  threefold.  Primarily,  it  provides  a  practical 
method  to  obtain  error  estimates  for  QMC  based  by  treating  each  scrambled  sequence  as  a  different  and 
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independent  random  sample  from  a  family  of  randomly  scrambled  quasirandom  numbers  [21].  Thus, 
RQMC  overcomes  the  main  disadvantage  of  QMC  while  maintaining  the  favorable  convergence  rate  of 
QMC.  Secondarily,  scrambling  gives  us  a  simple  and  unified  way  to  generate  quasirandom  numbers  for 
parallel,  distributed,  and  Grid-based  computational  environments.  Finally,  RQMC  provides  many  more 
choices  of  quality  quasirandom  sequences  for  QMC  applications,  and  perhaps  even  optimal  choices  as 
a  result  of  derandomization.  Thus,  a  careful  exploration  of  scrambling  and  derandomization  methods 
coupled  with  library-level  implementations  will  play  a  central  role  in  the  continued  development  and 
use  of  RQMC  techniques. 

In  this  paper,  we  propose  a  new  Sobol  scrambling  algorithm  based  on  the  permutation  of  several 
groups  of  bits  from  the  individual  Sobol  numbers.  Most  of  the  current  scrambling  methods  either 
randomize  a  single  digit  at  each  iteration,  or  randomize  a  group  of  digits  through  linear  operations.  In 
contrast,  our  multiple-digit  scrambling  is  efficient  and  fast  because  it  permutes  small  groups  of  digits. 
We  implemented  this  new  Sobol  scrambling  algorithm  in  the  software  context  of  the  SPRNGlibrary, 
because  SPRNGnot  only  generates  parallel  pseudorandom  numbers,  but  it  also  provides  an  extensible 
object-orient  interfaces  for  merging  scrambled  Sobol  sequences. 

The  remainder  of  this  paper  is  organized  as  follows.  In  §  2,  we  give  a  brief  introduction  to  quasiran¬ 
dom  or  low-discrepancy  sequences  and  the  Sobol  sequence  in  particular.  Our  scrambling  algorithm  is 
described  in  §  3.  In  §  3,  we  also  discus  the  implementation  of  the  algorithm  and  how  it  is  implemented 
in  the  object-oriented  environment  provided  by  the  SPRNGlibrary.  The  results  of  our  experiments  de¬ 
signed  to  verify  the  performance  of  the  algorithm  are  demonstrated  in  §  4.  In  §  5,  we  summarize  our 
conclusions  and  describes  our  future  work. 


_  ^ 

2  The  Sobol  Sequence 

In  this  section,  we  first  discuss  quasirandom  sequences  and  the  conventional  measure  used  for  evaluating 
the  uniformity  of  low-discrepancy  sequences.  Then  we  define  the  Sobol  sequence  and  algorithm  used  to 
generate  it  are  given.  Then,  a  few  general  scrambling  algorithms  are  presented. 


2.1  Low-Discrepancy  Sequences 


Quasirandom  numbers  are  produced  by  a  deterministic  sequence,  and  as  we  will  see  below,  they  are 
often  used  in  quasi-Monte  Carlo  integration.  The  reason  behind  using  deterministic  sequences  is  that 
the  independence  of  random  numbers  plays  a  secondary  role  to  their  uniformity  in  certain  Monte  Carlo 
calculations,  so  sequences  with  better  uniformity  properties  may  lead  to  smaller  errors  in  certain  ap¬ 
plications.  To  quantify  how  uniform  a  given  sequence  is,  we  need  a  well-defined  measure  of  uniformity. 
There  are,  in  fact,  many  ways  to  define  and  measure  uniformity,  we  here  use  the  one  based  on  Nieder- 
reiter’s  development  of  the  topic  [20]. 

Let  P  denote  the  s  dimensional  unit  cube.  An  infinite  sequence  {xn}  in  P  is  called  uniformly 
distributed  if  for  all  measurable  subsets  J  of  P 


1  ^ 

lirriN^ooj^  X]  =  m(  J), 

n=l 


(1) 


holds,  where  XJ  is  the  characteristic  function  of  J,  and  m(J)  is  the  volume  of  J.  Thus  in  the  limit  of 
an  infinite  number  of  points,  every  region  in  P  has  proportionally  the  right  number  of  points.  From 
this  definition  it  follows  that  a  sequence  {xn}  is  uniformly  distributed  if  for  all  Riemann  integrable 
functions,  /,  defined  on  P  it  holds  that 


N 

n=l 


f{x)dx. 


limjsf. 


(2) 


It  follows  from  the  Central  Limit  Theorem  that  a  sequence  of  independent,  identically  distributed  points 
chosen  from  the  uniform  distribution  on  the  interval  is  indeed  a  uniformly  distributed  sequence. 

Practically,  it  is  only  possible  to  deal  with  a  finite  number  of  integration  nodes,  so  it  is  necessary  to 
define  some  measure  of  uniformity  for  finite  point  sets.  Such  a  quantity  is  known  as  the  discrepancy. 
For  a  set  J  C  P  and  a  sequence  of  N  points  {xn}  in  /^,  define  this  measure  of  the  deviation  of  the 
point  set,  {xn}^  from  being  uniform  in  the  set  J  ^  P 

1  ^ 

RN{J)^j^Y.^j{x^)-m{J).  (3) 

n=l 

Various  kinds  of  discrepancy  can  be  defined  then  by  restricting  J  to  a  certain  class  of  subsets  and  taking 
a  norm  of  Rn  over  this  class.  If  E  is  the  set  of  all  rectangular  parallelepipeds  of  P^  then  the  L^o  and 
L2  norms  are  defined  as: 

Djsf  =  supj^E  \Rn{J)  \  ,  (4) 

and 

1 

Tn  =  [  {RN{J{x,y))fdxdy  .  (5) 

J  (y:,y)eP^,Xi<yi 

Here  J(x,y)  indicates  the  rectangular  parallelepiped  with  opposite  corners  at  (x,y).  If  F’*  is  the  set 
of  subrectangles  with  one  corner  at  x  =  0,  then  the  traditional  measures,  the  star  discrepancies  are 
defined  as: 

=  supj^E*  \Rn{J)  \  ,  (6) 

and 

1 

T^=  [  {RN{J{x))fdx  '  .  (7) 

Here  J{x)  is  the  rectangular  parallelepiped  with  one  corner  at  0  and  the  opposite  corner  at  x. 

Equation  5  is  only  a  theoretical  definition  of  the  L2  discrepancy,  [19]  gave  a  practical  equation  for 
calculating  this  L2  discrepancy. 

^  N  N  s  — s  ^  ^ 

Pn  ~  ^  ^  ^  ^  J^(l  Xncixiydfi^i^  ^  XTiiTiicLfi^i^  CLm^i)  ^  ^  ^  ^n^i)  T  12  (8) 

n=l  m=l  i=l  n=l  i=l 

The  Halton  [10],  Sobol  and  Faure  [8]  sequences  are  well  known  and  widely  applied  low-discrepancy 
sequences.  They  are  very  uniform  in  P^  however  they  have  some  very  poor  two-dimensional  projections. 
Fig.  1  shows  a  2-D  projection  of  Sobol  sequence.  The  two  dimensions  are  27th  and  28th.  Clearly  banded 
structures  and  large  gap  exists  in  the  figure.  The  organized  structures  in  Fig.  1  indicates  that  these  two 
dimensions  of  the  Sobol  sequence  are  not  very  uniformly  distributed.  The  organized  shape  and  large 
gaps  in  Fig.  1  may  cause  biased  samples  in  quasi-Monte  Carlo  applications.  Scrambling  is  a  solution 
for  these  poor  low-dimensional  projections. 

2.2  Generating  the  Sobol  Sequence 

Before  discussing  how  to  scramble  the  Sobol  sequence,  let  us  first  look  at  how  an  unscrambled  Sobol 
sequence  is  created  so  that  we  can  identify  the  important  characteristics  of  Sobol  sequence  that  we  will 
use  in  its  scrambling.  The  Sobol  sequence,  as  originally  defined  by  Sobol,  is  generated  from  a  set  of 
special  binary  vectors  of  length  w  bits,  i  =  1,  2,  •  •  •  ,  re,  j  =  1,  2,  •  •  •  ,  d.  These  numbers,  are  called 
direction  numbers.  In  order  to  generate  direction  numbers  for  dimension  j,  we  start  with  a  primitive 
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Figure  1:  A  Poor  2-D  Projection  of  the  Sobol  Sequence 


(irreducible)  polynomial  over  the  finite  field  T2  with  elements  {0,1}.  Let  us  suppose  the  primitive 
polynomial  used  in  generating  dimension  j  of  the  Sobol  sequence  is 

Pj{x)  =  -\-  aix^~^  +  ...  +  aq-ix  +  1.  (9) 

Once  we  have  chosen  this  polynomial,  we  use  its  coefficients  to  define  a  recurrence  for  calculating  vl  , 
the  direction  number  in  dimension  j.  It  is  generated  using  the  following  q-teim  recurrence  relation: 

vj  =  aivl_^  ©  a2vj_^  ©  •  •  •  ©  aq-ivl_g^^  ©  vl_g  ©  {vl_j2^),  (10) 

where  i  >  g,  ©  denotes  the  bitwise  XOR  operation,  and  the  last  term  is  vi-q  shifted  right  q  places.  The 
initial  numbers  •  2^,  'L’2  ’  2^?  '  ’  ’  •  2^  can  be  arbitrary  odd  integers  smaller  than  2,  2^,...,  and  2^, 

respectively.  The  Sobol  sequence  xh  {n  —  ^^2%  bi  G  {0, 1})  in  dimension  j  is  generated  by 

=  biv{  ©  b2V^2  ®  •  •  •  ®  bwvi  (11) 

We  should  use  a  different  primitive  polynomial  to  generate  Sobol  sequence  in  each  dimension. 

Generating  the  Sobol  sequence  as  defined  by  equations  ??  is  tedious  and  time  consuming.  Antonov 
and  Saleev  [1]  provided  another  efficient  way  to  calculate  the  Sobol  sequence.  They  proved  that  taking 

=  9ivi  ®  92V2  ®  •  •  •  (12) 

where  •  •  •  ^3^2^!  is  the  Gray  code  representation  of  n  does  not  affect  the  asymptotic  discrepancy.  For 
A:  =  2,  3,  •  •  • ,  this  shuffles  (permutes)  each  initial  segment  of  length  2^  of  Sobol’s  original  sequence.  We 
remind  the  reader  of  the  following  properties  of  the  Gray  code: 

•  The  Gray  code  for  n  is  obtained  from  the  binary  representation  of  n  using  •  •  •  gsg2gi  =  •  •  •  ® 

•  •  •  64^3^2- 


•  The  Gray  code  for  n  and  the  Gray  code  for  n  +  1  differ  in  only  one  bit.  If  be  is  the  rightmost 
zero-bit  in  the  binary  representation  of  n  (add  a  leading  zero  to  n  if  there  are  no  others),  the  Qc 
is  the  bit  whose  value  changes. 

Using  these  properties,  and  defining  xh  by  equation  12,  we  can  calculate  x^^^^  in  terms  of  xh  as 

=  (13) 

where  he  is  the  rightmost  zero-bit  in  the  binary  representation  of  n.  The  Ant onov-S alee v  method  is 
thus  faster  than  Sobol’s  original  scheme,  and  so  we  will  use  Ant  onov-S  aleev  method  to  implement  the 
original  Sobol  sequence. 

3  Scrambling  the  Sobol  Sequence 

This  section  discusses  about  our  scrambling  algorithm  and  its  implementation  in  SPRNG.  First  we  brieffy 
introduce  some  currently  used  scrambling  algorithms. 

3.1  Current  Scrambling  Methods 

Owen  investigated  a  general  approach  to  scrambling  of  nets  and  sequences  in  [23].  His  scrambling 
scheme  can  be  described  as  follows.  Let  6  >  2  be  an  integer.  Let  5  is  a  mapping  from  interval  [0, 1)  to 

the  6-ary  representation  of  6{A)  G  [0, 1)  determined  in  the  following  way.  Let  A  =  -h  a2b~‘^  H - , 

where  ai,  a2,  •  •  •  are  in  {0, 1,  •  •  •  ,  6  —  1}.  Next,  for  each  possible  value  of  ai,  we  fix  a  permutation  iVai 
of  {0, 1,  ...,6  —  1},  and  define  the  second  6-ary  digit  as  7Vai{a2).  We  can  continue  in  this  way  with  the 
definition  of  the  third  digit,  fourth  digit,  and  so  on,  and  so  obtain  7Tai^a2  (<^3)5  ^ai,a2,a3  (<^4)5  ’  ’  ’  •  Owen’s 
scrambling  scheme,  each  permutation  is  uniformly  distributed  over  the  6!  possible  permutations,  and  the 
permutations  are  mutually  independent.  In  s  dimensions,  we  consider  an  5-tuple  of  6- ary  scramblings 
,65).  Owen  showed  that  if  ,65  are  chosen  as  fully  random  and  mutually  independent, 

then  the  5-dimensional  scramblings  preserve  the  properties  of  (t,  m,  5)-nets  and  (t,  5)-sequences.  A 
consequence  of  this  is  a  sequence  and  its  scrambling  under  this  scheme  have  the  same  asymptotic 
discrepancy. 

Based  on  Owen’s  scrambling  scheme,  many  scrambling  other  algorithms  [22,  14,  2,  11]  that  ran¬ 
domize  a  single  digit  at  a  time  have  been  proposed.  Alternatively,  [6]  proposed  a  multiple  digital 
scrambling  approach  using  a  popular  pseudorandom  number  generator  as  a  scrambler,  namely  the 
Linear  Congruential  Generator  (LOG).  From  now  on  we  call  this  algorithm  linear  scrambling.  The 
procedure  is  described  as  follows: 

•  =  \_Xn  X  2^J ,  is  the  k  most-significant  bits  of  C,  the  number  to  be  scrambled. 

•  ^*  =  ayn{mod  m)  and  m  >  2^  —  1,  is  the  linear  scrambling  applied  to  this  integer. 

•  +  {xn  —  1^),  is  the  final  scrambled  form  derived  from  Xn> 

The  LOG  is  the  key  to  the  success  of  this  scrambling  method.  LCGs  with  both  power-of-two  and  prime 
moduli  are  common  pseudorandom  number  generators.  When  the  modulus  of  an  LOG  is  a  power-of- 
two,  the  implementation  is  cheap  and  fast  due  to  the  fact  that  modular  addition  and  multiplication 
are  just  ordinary  computer  arithmetic  when  the  modulus  corresponds  to  the  size  of  a  computer  word. 
The  disadvantage,  is  in  terms  of  quality,  as  it  is  hard  to  obtain  the  desired  quality  of  pseudorandom 
numbers  when  using  a  power-of-two  as  modulus  LOG.  The  linear  scramblings  thus  used  prime  moduli, 
but  only  primes  with  special  form  such  as  the  Mersenne  or  a  Sophie-Germain  primes. 


3.2  Scrambling  via  Permutation 

Owen’s  scrambling  scheme  randomizes  each  6-ary  number  an  by  uniformly  choosing  permutations 
7rai,a2,  - ,an-i  And  linear  scrambling  produce  randomization  through  multiplication 

with  a  pseudorandom  integer.  Our  new  algorithm  can  be  regarded  as  a  variant  of  Owen’s  scrambling 
scheme.  It  uses  a  random  number  to  choose  a  permutation  randomly  and  uniformly,  then  the  permu¬ 
tation  is  applied  to  a  certain  number  of  digits  for  scrambling.  The  algorithm  is  described  in  detail  as 
the  follows. 

Let  denote  the  set  of  all  permutation  of  the  numbers  {0, 1,  •  •  •  ,  re  — 1},  obviously  the  cardinality  of 

is  w\.  A  bijection,  ,is  defined  as  :  {0, 1,  •  •  •  ,  re!  — 1}  ^  so  that  for  any  a  G  {0, 1,  •  •  •  ,  re!  — 1}, 
(j)^{a)  is  a  permutation  of  the  numbers  {0, 1,  •  •  •  ,re  —  1}.  To  simplify  notation,  we  use  (j)^  to  denote 
(/)^(a).  Assume  that  xh  is  the  Sobol  number  in  the  jth  dimension  corresponding  to  n  as  used  in  §  2.2. 
Then  xi,  —  go  x  (2^)^  +  gi  x  (2^)^  +  ^2  x  (2^)^+,  •  •  • ,  where  A;  >  1.  If  is  the  nth  random  number 
from  random  number  sequence,  j,  and  G  {0, 1,  2,  •  •  •  ,  re!  —  1},  then  the  scrambled  Sobol  number  is 


X: 


j*  _ 


—  Ado)  X  (2^)^  -h  (^i  Adi)  X  (2^)^  -h  (^i  {dA  X  (2^)"^-h, 


k\l 


ik\2 


(14) 


The  idea  behind  this  algorithm  is  to  replace  each  fc-bit  digit  in  by  the  permutation  We 

represent  the  Sobol  number  in  binary  because  Sobol  sequence  generation  is  done  with  operations  in 
the  finite  field  ^2-  The  algorithm  can  be  easily  modified  to  handle  other  low-discrepancy  sequences 
represented  in  bases  other  than  2,  e.g.  the  prime  number  based  Halton  sequence. 

One  problem  in  this  algorithm  is  that  for  each  iteration  we  need  to  permute  multiple  groups  of 
digits  to  scramble  all  the  digits.  Since  the  first  k  significant  bits  play  a  key  role  in  deciding  the  value 
of  the  number,  we  can  get  by  with  only  significant  the  most  significant  k  bits,  i.e.  given  xh^  there  is  an 
integer  m  >  0  which  satisfies  x  (2^)^  +  diX  (2^)^  +  ^2  x  (2^)^-h,  •  •  •  ,  +gm  x  (2^)^  and  gm  A  0* 

The  scrambled  Sobol  number  is  thus 


xi*  =  50  X  (2'=)°  +  51  X  (2'')^+,  x  (2'')—^  +  ^  (gm)  x  (2'^) 


ik\l 


k\m—l 


(15) 


The  algorithm  can  be  further  extended  to  replace  the  first  p  significant  fc-bit  digits  through  permu¬ 
tations,  m  >  p  >  1.  Algorithm  1  illustrates  our  Sobol  scrambling  algorithm. 


1.  Generate  the  nth  Sobol  number  in  dimension  j\  xA) 

2.  Disassemble  xii  into  {50, 5i) '  •  •  ,5m}; 

3.  Get  a  random  number  rnj  from  a  pseudorandom  number  generator; 

4.  Choose  a  permutation  .  from  the  permutation  set  dl2k] 

5.  Replace  ^dm\  with  ^dm—p^Arn^ji.dm—p+l)^'''  5 

6.  Reassemble  the  Sobol  number  as 

xL  ^  50  X  (2^)0+,  •  •  •  ,  +5m-p  X  (2^)— P  +  (5m-p+l)  X  (2^)— P+1+,  .  .  .  ,  +<^^^  .  (^^)  X  (2^)-; 

Algorithm  1:  Our  Algorithm  for  Scrambling  the  Sobol  Sequence 


3.3  Implementing  the  Scrambling  inSPRNG 

The  algorithm  1  described  in  the  previous  section  was  implemented  using  the  software  infrastructure 
provide  by  the  SPRNGlibrary.  The  SPRNGlibrary  was  designed  to  support  parallel  Monte  Carlo  appli¬ 
cations  on  scalable  and  distributed  architectures.  It  contains  most  of  the  important  pseudorandom 
number  generators,  and  many  tools  for  statistical  testing  of  pseudorandom  number  generators.  Fig.  2 
illustrates  the  software  structure  of  SPRNGusing  UML,  the  Unified  MModeling  language.  The  directory 
SRC  is  the  major  directory,  which  contains  all  the  core  implementations  of  the  SPRNGlibrary.  Codes 


Figure  2:  The  Software  Structure  of  SPRNG 


for  the  different  pseudorandom  number  generators  are  within  the  subdirectory  SRC.  SRC  also  includes 
some  important  utility  tools  like  a  prime  number  generator.  Directory  TEST  is  the  directory  containing 
different  kinds  of  statistical  tests  for  random  number  generators,  including  the  collision  test,  coupon  col¬ 
lector’s  test,  equidistribution  test.  Directory  EXAMPLES  includes  not  only  applications  for  demonstrating 
the  usage  of  the  SPRNGlibrary,  but  also  a  few  driver  programs  for  testing  and  debugging  purposes. 

The  key  class  in  SPRNGis  the  Sprng  class,  which  is  the  parent  class  of  all  the  other  pseudorandom 
number  generator  classes.  In  other  words,  as  long  as  the  pseudorandom  number  generator  class  inherits 
and  implements  the  Sprng  class’s  interface,  it  can  be  merged  into  the  SPRNGlibrary  seamlessly.  For  this 
reason,  our  Sobol  sequence  generator  class  Sobol  inherits  the  class  Sprng,  as  demonstrated  in  Fig.  3. 
Considering  that  the  low-discrepancy  sequences  are  widely  used  in  high  dimensional  integration,  we 
extended  the  interface  by  adding  a  new  class  function  get_rn_vector.  The  vector  value  returned  by 
this  function  is  a  point  in  P. 

Scrambler  is  a  virtual  class  which  defines  the  general  interface  for  all  the  scrambler  classes.  LinearScrambler 
is  an  implementation  of  the  scrambling  algorithm  in  [6].  PermutationScrambler  implements  the  algo¬ 
rithm  1. 

PermuationFactory  is  a  virtual  class  to  generate  the  permutation  used  in  step  4  of  algorithm  1. 

Two  child  classes  of  PermuationFactory  are  implemented  to  provide  real  permutations  for  an  input 
random  number  StaticFactory  will  generate  and  store  all  the  n\  permutations  for  {0, 1,  •  •  •  ,  n—  1} 
at  initialization.  For  any  integer  n!  —  1  >  r  >  0,  StaticFactory  retrieves  the  rth  permutation  and 
returns  it. 

Except  for  the  cost  of  initializing  all  of  the  n\  permutations,  StaticFactory  is  quite  efficient,  but 
clearly  the  n  used  must  be  confined  to  a  small  values.  For  the  Sobol  sequence,  2^!  and  2^!  are  already 
big  enough  for  scrambling  the  sequences  as  shown  by  the  empirical  evidence  in  §  4. 

Unlike  StaticeFactory,  DynamicFactory  generates  one  permutation  according  to  the  value  of 
the  random  number  Vnj  at  the  time  the  class’s  function  is  called,  not  when  the  class  is  initialized. 
Theoretically  it  can  generate  any  permutation  of  arbitrarily  large  D-u;.  In  [24],  a  few  fast  algorithms  to 
randomly  generate  permutations  were  introduced.  DynamicFactory  uses  the  algorithm  published  by 
R.  Durstenfeld  [7]  to  randomly  generate  permutations. 


I 

I 


I 
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Figure  3:  Sobol  in  SPRNG 


CPU 

Memory 

Compiler 

os 

SPRNG 

Pentium(R)  D  3.00  GHz 

2GB 

gee  4.1.1 

Linux  2.6. 18-8. el5 

version  4.0 

Table  1:  Experiment  Configurations 


4  Computational  Experiments 

We  designed  a  group  of  experiments  to  verify  the  performance  of  our  scrambling  algorithm.  The  tests 
were  executed  on  a  Linux  PC.  The  software  and  hardware  configurations  of  the  experiments  are  listed 
in  table  1. 

In  the  experiments,  we  used  the  LCG  of  SPRNGto  generate  pseudorandom  numbers  for  picking  a 
permutation  from  The  experiments  comprise  two  types:  2-D  projections  tests  (to  visually  verify 
scrambled  points)  and  the  computation  of  the  L2  discrepancy  test,  where  5  is  used  to  calculate  it.  Two 
parameters  are  critical  in  our  scrambling  algorithm,  one  is  the  number  of  bits  to  be  replaced  at  a  time 
(k  in  algorithm  1),  and  the  other  is  the  number  of  A;-bit  digits  to  be  replaced  in  one  iteration  (p  in 
Alg.  1).  We  use  the  notation  k  x  p  to  denote  these  two  parameters.  For  example,  2x8  means  eight 
2-bit  digits  (from  most-significant  bits  to  the  least)  will  be  replaced  for  each  Sobol  number. 

4.1  2-D  Projections 

Fig.  1  demonstrates  the  banded  structure  and  large  gaps  in  the  2-D  projection  of  27th  and  28th  di¬ 
mensions  of  the  Sobol  sequence.  This  illustrates  the  correlated  relationship  between  those  dimensions. 
Our  scrambled  Sobol  sequence  successfully  breaks  such  such  structures,  as  demonstrated  in  Fig.  4  and 
Fig.  5.  In  Fig.  1,  4096  points  were  generated,  and  the  27th  and  28th  dimensions  were  projected  on  the 
plane  in  Fig.  4  and  Fig.  5.  In  the  left  figure  of  Fig.  4,  although  most  of  the  banded  structures  have  been 
destroyed,  a  few  still  exist  in  the  center,  left-bottom  and  right-top  corner.  The  reason  for  this  is  that 
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Figure  4:  A  2-D  Projection  of  Scrambled  Sobol  Sequence.  Left:  2x1;  Middle:  2x8;  Right:  2  x  15 
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Figure  5:  A  2-D  Projection  of  Scrambled  Sobol  Sequence.  Left:  3x1;  Middle:  3x5;  Right:  3  x  10 


only  randomizing  the  2  most- significant  bits  could  not  effectively  scramble  the  Sobol  sequence.  This  is 
verified  by  the  middle  and  right  graphs  of  Fig.  4.  In  the  middle  graph  of  Fig.  4,  the  banded  structures 
have  totally  disappeared  after  half  of  bits  were  scrambled.  But  it  is  enough  to  randomize  three  bits 
at  one  time  to  destroy  the  correlations  between  the  27th  and  28th  dimensions  as  demonstrated  in  left 
graph  of  Fig.  5. 

4.2  The  L2  Discrepancy 

This  section  explores  the  empirical  relationship  between  number  of  points  and  L2  discrepancy  in  terms 
of  our  scrambling  algorithm.  We  plot  the  logarithm  of  both  of  number  of  points  and  of  L2  discrepancy, 
so  that  the  two  values  produce  a  linear  relation  in  the  graph  if  the  L2  discrepancy  obeys  a  power-law 
relationship  with  the  number  of  points.  In  that  case,  the  slope  in  the  log- log  plot  is  the  exponent.  All 
of  the  tests  were  done  on  two  Sobol  sequences:  one  is  10-dimensional  and  the  other  is  100-dimensional. 
Fig.  6  illustrates  the  linear  relation  between  L2  discrepancy  and  number  of  points  for  the  10-  and 
100-dimensional  Sobol  sequences.  We  scrambled  the  Sobol  sequences  with  different  parameter  configu¬ 
rations,  and  for  each  parameter  configuration,  we  generated  four  scrambled  streams.  There  were  16384 
points  created  for  each  scrambled  or  unscrambled  Sobol  sequence.  The  interval  [1,16384]  is  equally 
divided  into  32  subintervals,  and  the  L2  discrepancy  is  calculated  on  those  boundary  points  of  the  32 
subintervals  that  are  512, 1024,  •  •  •  ,  16384. 

Fig.  7  and  Fig.  8  are  for  the  scrambled  10-dimensional  Sobol  sequences  with  8  groups  and  16  groups 
of  2-bit  permutations.  Compared  with  the  left  graph  of  Fig.  6  (the  unscrambled  10-dimensional  Sobol 
sequence),  all  the  eight  graphs  have  a  quite  similar  linear  relationship.  The  slopes  of  the  log-log  plots 
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Figure  6:  L2  Discrepancy  of  Sobol  Sequences.  Left:  lO-Dimension  Sobol;  Right:  lOO-Dimension  Sobol 


Dimensions 

Unscrambled 

2x8 

2  X  15 

3x5 

3  X  10 

10 

0.005999 

0.024497 

0.029746 

0.026246 

0.031496 

100 

0.055993 

0.302454 

0.351697 

0.314202 

0.350697 

Table  2:  CPU  Time  for  Generating  16384  Scrambled  and  Unscrambled  Sobol  Points 
are  close  to  —0.5. 

Similarly,  Fig.  9  and  Fig.  10  are  for  scrambled  10-dimensional  Sobol  sequences  with  5  and  10  groups 
of  3-bit  permutations.  The  slopes  of  the  log-log  plots  are  also  close  to  —0.5,  but  a  little  bigger  than 
—0.5.  And  the  result  of  5  groups  scrambling  is  very  similar  to  10  groups  scrambling. 

The  next  four  figures  are  all  for  scrambled  100-dimensional  Sobol  sequences.  Fig.  11  and  Fig.  12  are 
for  8  groups  and  15  groups  of  2-bit  permutations.  Fig.  13  and  Fig  14  are  for  5  groups  and  10  groups  3-bit 
permutations.  Compared  with  right  graph  of  Fig.  6  (the  unscrambled  100-dimensional  Sobol  sequence), 
all  the  8  streams  of  2-bit  scramblings  (Fig.  11  and  Fig.  12)  have  a  very  similar  linear  relationship  to 
that  of  original  Sobol  sequence  -  the  slope  values  are  close  to  —1.  But  for  3-bit  scramblings  (Fig.  13 
and  Fig.  14),  only  two  of  eight  streams  have  a  linear  relationship  similar  to  that  of  the  unscrambled 
Sobol  sequence  (right  graph  of  Fig.  6).  To  further  investigate  this,  we  put  all  of  the  L2  discrepancy  data 
of  3-bit  scramblings  in  Fig  14  together  with  that  of  the  unscrambled  Sobol  sequence  into  Fig.  15,  and 
we  used  the  number  of  points,  not  the  logarithm,  on  the  x-axis  of  Fig.  15.  All  the  scrambled  sequences 
have  lower  discrepancy  than  the  unscrambled  sequence  does.  The  scrambled  sequence  (stream  3  in 
Fig.  15  and  Fig.  14)  whose  linear  relation  is  similar  to  that  of  unscrambled  sequence  has  a  more  smooth 
shape  than  other  scrambled  sequences  do. 

Table  2  lists  the  time  used  for  generating  those  scrambled  and  unscrambled  Sobol  sequences  in  units 
of  seconds.  Clearly  our  scrambling  algorithm  is  about  4  to  6  times  slower  than  the  unscrambled  Sobol 
sequence.  But  time  used  for  generating  the  scrambled  100-dimension  Sobol  sequence  is  almost  10  times 
that  of  generating  the  scrambled  10-dimension  Sobol  sequences.  This  means  that  our  algorithm  appears 
to  scale  linearly  with  dimension.  We  also  want  to  mention  that  we  did  not  optimize  the  implementation 
of  our  algorithm. 

5  Conclusions 

We  provide  an  algorithm  to  effectively  randomize  the  Sobol  sequence  vi  multi-bit  permutation.  The 
advantage  of  this  algorithm  is  that  randomization  and  scrambling  speed  can  be  balanced  through  setting 
different  parameter  configurations  -  k  and  p.  Experimental  results  verify  that  our  algorithm  not  only 


Figure  7:  L2  Discrepancy  of  Scrambled  10-Dimension  Sobol  2x8 


Figure  8:  L2  Discrepancy  of  the  Scrambled  10-Dimensional  Sobol  2  x  15 


Figure  9:  L2  Discrepancy  of  the  Scrambled  10-Dimensional  Sobol  3x5 


Figure  10:  L2  Discrepancy  of  the  Scrambled  10-Dimensional  Sobol  3  x  10 


Figure  11:  L2  Discrepancy  of  the  Scrambled  100-Dimensional  Sobol  2 


Figure  12:  L2  Discrepancy  of  the  Scrambled  100-Dimensional  Sobol  2  x  15 


Figure  13:  L2  Discrepancy  of  the  Scrambled  100-Dimensional  Sobol  3x5 


Figure  14:  L2  Discrepancy  of  the  Scrambled  100-Dimensional  Sobol  3  x  10 


Figure  15:  L2  Discrepancy  of  the  Unscrambled  and  All  the  Scrambled  (3  x  10)  100-Dimensional  Sobol 
Sequences 

effectively  breaks  the  correlations  between  dimensions,  but  also  has  a  low  discrepancy  similar  to  the 
unscrambled  Sobol  sequence.  While  our  method  is  not  strictly  an  Owen  scrambling,  this  confirms  that 
it  behaves  as  such. 

With  certain  modifications,  the  scrambling  algorithm  can  also  be  applied  to  other  low-discrepancy 
sequences  such  as  the  Halton  and  Faure  sequences.  One  problem  for  applying  the  algorithm  to  other 
quasirandom  sequences  such  as  the  Halton  sequence,  is  how  to  generate  randomly  and  efficiently  the 
large  number  permutations  required  for  large  bases.  It  is  also  important  to  point  out  that  our  current 
implementation  has  not  been  optimized.  We  plan  to  do  this  in  the  future.  But  we  also  need  to  develop 
a  measure  to  evaluate  how  the  algorithm’s  randomization  process  effects  the  discrepancy  properties  of 
the  sequences,  or  in  other  words,  how  to  derandomize  the  Sobol  sequences  scrambled  by  our  algorithm. 
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